Abstract. Wireless sensor networks (WSNs) are emerging as an effective means
for environment monitoring. This paper investigates a strategy for energy
efficient monitoring in WSNs that partitions the sensors into covers, and then
activates the covers iteratively in a round-robin fashion. This approach takes
advantage of the overlap created when many sensors monitor a single area. Our
work builds upon previous work in [2], where the model is first formulated. We
have designed three approximation algorithms for a variation of the SET K-COVER
problem, where the objective is to partition the sensors into covers such that
the number of covers that include an area, summed over all areas, is maximized.
The first algorithm is randomized and partitions the sensors, in expectation,
within a fraction $1 - 1/e$ (~.63) of the optimum. We also present two
deterministic approximation algorithms. One is a distributed greedy algorithm
with a $1/2$ approximation ratio and the other is a centralized greedy
algorithm with a $1 - 1/e$ approximation ratio. We show that it is NP-Complete
to guarantee better than $15/16$ of the optimal coverage, indicating that all
three algorithms perform well with respect to the best approximation algorithm
possible. Simulations indicate that in practice, the deterministic algorithms
perform far above their worst case bounds, consistently covering more than 72%
of what is covered by an optimum solution. Simulations also indicate that the
increase in longevity is proportional to the amount of overlap amongst the
sensors. The algorithms are fast, easy to use, and according to simulations,
significantly increase the longevity of sensor networks. The randomized
algorithm in particular seems quite practical.



1 Introduction 

We study the problem of designing an efficient and distributed algorithm that 
partitions the sensors in a WSN into k covers such that as many areas are 
monitored as frequently as possible. The problem of choosing a cover for each 
sensor is abstracted into a variant of the SET K-COVER problem, in which we 
are given a finite set $S$ of elements, corresponding to the areas to be monitored, a
collection $S_1, \ldots, S_n$ of subsets of $S$, where each $S_j$ represents a sensor and contains
the areas that sensor monitors from $S$, and a positive integer $k \geq 2$. The goal
is to find a partition of the subsets into $k$ covers $c_1, \ldots, c_k$, where each cover is
a set of subsets, such that $\sum_{i=1}^{k} \left| \bigcup_{S_j \in c_i} S_j \right|$ is maximized. Informally, we are
maximizing the number of times the areas are covered by the partition. 
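
As a concrete reading of this objective, the following minimal Python sketch counts, for each cover, the distinct areas it monitors and sums the counts. The function name `coverage_objective`, the dictionary-based representation, and the toy instance are illustrative assumptions, not part of the original formulation.

```python
from typing import Dict, List, Set


def coverage_objective(subsets: Dict[int, Set[int]],
                       assignment: Dict[int, int],
                       k: int) -> int:
    """Sum, over all k covers, of the number of distinct areas each cover monitors."""
    covered: List[Set[int]] = [set() for _ in range(k)]
    for sensor, cover in assignment.items():
        covered[cover] |= subsets[sensor]
    return sum(len(areas) for areas in covered)


# Hypothetical toy instance: three sensors, areas labelled 0..2, k = 2 covers.
subsets = {0: {0, 1}, 1: {1, 2}, 2: {0, 2}}
assignment = {0: 0, 1: 0, 2: 1}  # sensors 0 and 1 in cover 0, sensor 2 in cover 1
print(coverage_objective(subsets, assignment, k=2))  # cover 0 sees 3 areas, cover 1 sees 2
```

An area monitored by two sensors in the same cover is counted once for that cover, which is why spreading overlapping sensors across covers increases the objective.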



The SET K-COVER problem can be used to increase the energy efficiency 
of WSNs. A single area in a WSN may be covered by multiple sensors due 
to the ad hoc nature of sensor placement, topological constraints, or perhaps 
to compensate for the short lifetime of a sensor by placing multiple sensors 
close together. Therefore, in an effort to increase the longevity of the network 
and conserve battery power, it can be beneficial to activate groups of sensors in 
rounds, so that the battery life of a sensor is not wasted on areas that are already 
monitored by other sensors. In addition, certain batteries last up to twice as long 
when used in short bursts as opposed to continuously [1]. Therefore, activating 
a sensor only once every k time units can extend the lifetime of its battery. 

Previous results on this problem [2] solve a fair version where the objective 
is to maximize k such that every cover contains all the elements. In many en- 
vironments, requiring that a cover contain all the elements may be too strict. 
Consider, for instance, that there is a single area that is monitored by only one 
sensor but all other areas are monitored by hundreds of sensors. Except for that 
single area, all other areas could be covered for much longer by dividing the 
sensors into covers. But in the fair version, we cannot partition the sensors at all 
because only one partition would be able to monitor that one area. Therefore, 
we relax the requirement that every cover contain all the elements. 

We explore three algorithms that solve the SET K-COVER problem: random- 
ized, distributed greedy, and centralized greedy. In the randomized algorithm, 
each sensor simply assigns itself to a cover chosen uniformly at random from 
the set of all possible covers. In the distributed greedy algorithm, each sensor 
assigns itself, in turn, to the cover with the minimum intersection between the 
areas the sensor monitors and the areas monitored by the cover thus far. The 
centralized greedy algorithm is similar to the distributed greedy, except that an 
area in the intersection is weighted based on how likely it is to be covered by 
some other sensor later on in the assignment process. 

The performance of our three algorithms is summarized in Table 1. One
metric for the performance of our algorithms is the worst case ratio between the
number of times the areas are covered, according to the algorithm's partition,
and the optimum number of times the areas can be covered by any partition.
This ratio is referred to interchangeably as the performance guarantee and the
approximation ratio. Our simulations show that for high density networks, the
SET K-COVER partition can simultaneously achieve high k and high coverage
at each time instant. Simulation results indicate that the increase in longevity is
a constant function of the density of the network. In Table 1, $|E|$ is the number
of sensor-area pairs such that the given sensor covers the given area, and c is
a scaling factor (perhaps dependent on other problem parameters). The running
time of an algorithm is the number of time units the sensor network needs to
create the partition (within the distributed or centralized setting in which the
algorithm is presented). $|S_{max}|$ is the cardinality of the largest subset. There is
no worst case guarantee on fairness for the distributed and centralized greedy
algorithms. However, in simulations, calculations of the area that is covered by
the least number of covers, relative to the number of sensors that are capable of
covering it, suggest the algorithms are fair in practice.

Table 1. Summary of Results. 



We find, in accordance with "No Free Lunch Theorems" [11] that there is a 
trade-off between the complexity (both in terms of running time and simplicity) 
and the performance guarantee. The randomized algorithm is remarkably simple, 
robust, easy to use, and easy to code. It is also fair in two respects. 

1. In expectation, an area is covered within $1 - 1/e$ of the maximum number of
times possible.

2. With high probability, the least covered area is covered within $\ln n$ of the
maximum number of times possible.

The randomized algorithm does bear some risk since its approximation ratio is
an expectation. The distributed greedy algorithm has a deterministic approximation
ratio, but the ratio is smaller than the ratio for the randomized algorithm,
and both the running time and the requirements of the network are slightly
higher. Finally, the centralized greedy algorithm gives a best possible guarantee
for some variants of the problem, but it may not always be possible to design a
distributed implementation.

We show that it is NP-Complete to guarantee better than $15/16$ of the optimal
coverage, indicating that all three algorithms perform well with respect to
the best approximation algorithm possible. The hardness of approximation is
obtained by a reduction from the E4-SET SPLITTING problem.



Simulations show that in practice, the algorithms perform well above the 
worst case bounds proved in the theoretical analysis. Many simulations show 
the algorithms covering more than 99% of the maximum possible. 

Simulations also suggest that using the sensors in rounds has the potential to 
significantly increase the longevity of sensor networks. In simulation results, the 
energy savings using the SET K-COVER algorithm are directly proportional to 
the density of the network. Significant increases in the longevity of the network 
are observed when the overlap between sensors is high. In addition, there is time 
gained by extended battery lifetimes due to operation in short bursts. 

The paper is organized as follows. In Sections 2, 3, and 4 respectively, a
randomized, distributed greedy, and centralized greedy algorithm are presented
and analyzed. Section 5 shows the hardness of approximation for the SET K-COVER
problem. Section 6 contains the results of various simulations. We
conclude with open problems and areas of further exploration.

2 Randomized Algorithm 

The randomized algorithm assigns each sensor to a cover chosen uniformly at ran- 
dom. It requires no preprocessing and makes extremely few assumptions about 
the network. Its simplicity facilitates implementation, use, and maintenance. It 
is also robust to sensor failure, and can easily accommodate the entry of new 
sensors into the system. In addition, the expected coverage is high, at least $1 - 1/e$
of the best coverage possible. This is also true per individual area, so that the
expected amount an area is covered is proportional to how many sensors are ca- 
pable of monitoring that area. We can attain close to the expected performance 
in practice because the algorithm is simple enough that it can be run many times 
during the lifetime of the sensor network. This reduces the risk that the overall 
performance is far from the average. 
ASSUMPTIONS: 

1. It is assumed that all sensors have clocks with a unified start time $t_0$, so that
operations can be synchronized.

2. Each sensor has a random number generator. 

The following algorithm partitions the sensors into covers and is executed in 
parallel at each sensor starting from initialization at time t = 0. 

Randomized Algorithm at Sensor j
    Choose a random number $i \in \{1, \ldots, k\}$;
    Assign self to cover $c_i$;

At the end of the algorithm, sensor j belongs to cover $c_i$. During the round-robin
covering of the areas, sensor j will activate itself when cover $c_i$ is active.
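
The procedure above can be sketched in a few lines of Python. This is a centralized simulation under stated assumptions, not sensor firmware: the names `randomized_partition` and `is_active`, the 0-based cover indices, and the seeded generator (used only for reproducibility) are all illustrative.

```python
import random
from typing import Dict, Optional


def randomized_partition(num_sensors: int, k: int,
                         seed: Optional[int] = None) -> Dict[int, int]:
    """Each sensor independently picks one of k covers uniformly at random."""
    rng = random.Random(seed)
    return {j: rng.randrange(k) for j in range(num_sensors)}


def is_active(assignment: Dict[int, int], sensor: int, time_step: int, k: int) -> bool:
    """Round-robin activation: cover (time_step mod k) is the active one."""
    return assignment[sensor] == time_step % k


assignment = randomized_partition(num_sensors=8, k=3, seed=0)
for t in range(3):
    active = [j for j in assignment if is_active(assignment, j, t, k=3)]
    print(f"time step {t}: active sensors {active}")
```

Because no sensor needs information from any other sensor, the same choice could be made locally at each node, which is what makes the approach robust to failures and late arrivals.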

Theorem 1. The expected number of times elements are covered by the randomized
algorithm is a $1 - 1/e$ approximation to OPT, where OPT is the best coverage
possible.



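Proof. For a single area $v$, we will calculate $E[l_v]$, the expected number of covers
that cover $v$ in our solution. We use $N_v$ to denote the number of subsets that
contain $v$. A cover will not contain $v$ with probability $(1 - \frac{1}{k})^{N_v}$ because there
are $N_v$ sets to be assigned and each has probability $\frac{1}{k}$ of being assigned to a
particular cover. The expected number of covers containing $v$ is $k - k(1 - \frac{1}{k})^{N_v}$.
So the total expected number of times areas are covered by the partition is

$$\sum_{v \in S} E[l_v] = \sum_{v \in S} \left( k - k\left(1 - \frac{1}{k}\right)^{N_v} \right).$$

Let $l_v^*$ be the number of times $v$ is covered in the optimum solution. Then,
$l_v^* \leq \min(k, N_v)$ because an area cannot be covered by more than $k$ covers or by
more than the number of subsets containing it. The expected number of times
areas are covered by the algorithm is at least $\sum_{v} E[l_v]$ and the total covered
by OPT is at most $\sum_{v} \min(k, N_v)$. To show the overall fraction
$\frac{\sum_v E[l_v]}{\sum_v \min(k, N_v)} \geq 1 - \frac{1}{e}$, we will show that
$\forall v$, $\frac{E[l_v]}{\min(k, N_v)} \geq 1 - \frac{1}{e}$. There are two cases.

I: $k \leq N_v$. Then,

$$\frac{E[l_v]}{\min(k, N_v)} = 1 - \left(1 - \frac{1}{k}\right)^{N_v} \geq 1 - \left(1 - \frac{1}{k}\right)^{k} \geq 1 - \frac{1}{e}.$$

The last inequality is due to the power series expansion of $e^x$, which shows that
$1 - \frac{1}{k} \leq e^{-1/k}$ and hence $(1 - \frac{1}{k})^k \leq e^{-1}$.

II: $k > N_v$. Then,

$$\frac{E[l_v]}{\min(k, N_v)} = \frac{k - k(1 - \frac{1}{k})^{N_v}}{N_v}.$$

We will show that the derivative of this ratio with respect to $N_v$ is
negative, implying that the ratio is smallest when $k = N_v$.

$$\frac{d}{dN_v} \left( \frac{k - k(1 - \frac{1}{k})^{N_v}}{N_v} \right) = \frac{k \left[ (1 - \frac{1}{k})^{N_v} \left( 1 - \ln (1 - \frac{1}{k})^{N_v} \right) - 1 \right]}{N_v^2}$$

This is negative iff

$$1 + \ln \left(1 - \frac{1}{k}\right)^{-N_v} < \left(1 - \frac{1}{k}\right)^{-N_v},$$

which is again true due to the power series expansion ($1 + t < e^t$, $t \neq 0$ [9]).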
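
The per-area bound can also be checked numerically. The sketch below evaluates the expected number of covers containing an area with $N_v$ sensors, $k - k(1 - 1/k)^{N_v}$, divides by the upper bound $\min(k, N_v)$ on the optimum, and takes the minimum over a grid of parameters. The function names and the grid limits are arbitrary illustrative choices; a finite grid is a sanity check, not a proof.

```python
import math


def expected_covers(k: int, n_v: int) -> float:
    """Expected number of covers containing an area monitored by n_v sensors."""
    return k - k * (1 - 1 / k) ** n_v


def per_area_ratio(k: int, n_v: int) -> float:
    """Expected coverage divided by the upper bound min(k, n_v) on the optimum."""
    return expected_covers(k, n_v) / min(k, n_v)


# Worst ratio over an (arbitrary) finite grid of k and N_v values.
worst = min(per_area_ratio(k, n_v) for k in range(2, 60) for n_v in range(1, 200))
print(f"worst ratio on grid: {worst:.4f}, 1 - 1/e: {1 - 1 / math.e:.4f}")
```

On this grid the minimum is attained at $k = N_v$ for the largest $k$ tried, matching the argument that the ratio is smallest when $k = N_v$ and approaches $1 - 1/e$ from above as $k$ grows.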

Another attractive property of the randomized algorithm is that the element
that is covered least is not covered too much less than the maximum number
of times that it could possibly be covered. From case I above, in expectation,
an area is covered within $1 - 1/e$ of the maximum number of times possible. The
tails of the distribution over $l$ can also be bounded. More precisely, suppose our
objective is to find a partition of the subsets of $S$ into $k$ covers such that $l$ is
maximized, where $l$ satisfies $l \leq l_v$ for all $v \in S$, with $l_v$ the number of covers
that contain $v$. Let $l^*$ be the optimum value of $l$.

Lemma 2. With high probability (greater than $1 - \frac{1}{n}$) the randomized algorithm
gives a solution with $l \geq \frac{l^*}{24 \ln n}$.

Proof. Let $l_v$ be the number of covers area $v$ belongs to after the randomized
rounding and $N_v$ be the number of sets containing $v$. Then $\mu_v = E[l_v]
\geq (1 - \frac{1}{e}) \min(k, N_v) \geq (1 - \frac{1}{e}) l^*$. Each $v$ falls into one of two cases.

I: Uy < 161nn. Then ly > 1 > > — > ^r^- 

— — — 16 Inn — —^16 Inn — 24 Inn 



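II: $\mu_v \geq 16 \ln n$. Using Chernoff bounds,

$$\Pr\left(l_v < \frac{\mu_v}{2}\right) \leq \exp\left(-\frac{\mu_v}{8}\right) \leq \frac{1}{n^2}.$$

Because $\frac{l^*}{24 \ln n} \leq \frac{\mu_v}{2}$, $\Pr\left(l_v < \frac{l^*}{24 \ln n}\right) \leq \frac{1}{n^2}$.

The probability that a single $l_v$ is less than $\frac{l^*}{24 \ln n}$ is less than $\frac{1}{n^2}$, so the
probability that any $l_v$ is less than $\frac{l^*}{24 \ln n}$ is less than $\frac{n}{n^2} \leq \frac{1}{n}$ due to the
Boole-Bonferroni inequalities [9]. Therefore, the probability that the result does
not have all $l_v$ within the stated factor is small, less than $\frac{1}{n}$.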

3 Distributed Greedy Algorithm 

The distributed greedy algorithm, in contrast with the randomized algorithm, 
gives a deterministic guarantee that the produced partition covers at least half 
as many areas as the best possible partition. The algorithm makes some assump- 
tions about what the network is able to do and also requires some preprocessing 
steps. 

ASSUMPTIONS: 

1. A clock with a unified start time $t_0$, so that operations can be synchronized.

2. A unique ID number taken from the set of integers $j \in \{1, \ldots, n\}$.

3. Knowledge of the parameter $k$ and memory for storing a matrix of size $k \times |S_j|$,
all entries initialized to 1.

4. Some way to recognize an area of interest (for instance, geographic coordinates 
or a mapping from unique sensor information to an area identification number) . 

5. Some way to communicate with other sensors that cover a common area 
(preferably in a local manner through direct broadcasting). 



3.1 PREPROCESSING PHASE 

Several preprocessing steps must take place before the partition can be created. 

First, each sensor determines which areas of interest it will be capable of 
monitoring once it is in an activated 'on' state. This can be done using GPS or 
sensor localization which is itself an area of active research, and algorithms to 
achieve this task are described in [5], [3], and [4], among others. 

Next, each sensor must determine a method of communication with other 
sensors covering the areas that it covers, which we will refer to as the sensor's 
neighbors. It may be necessary to communicate this information using a broadcasting
tree [10] or other forms of message routing. We will give a two step distributed
algorithm for stationary sensors in Euclidean space with no obstacles.
However, the specific implementation of this task will vary between applications. 

1: Every sensor broadcasts its unique sensor ID number, the areas it monitors, 
and the distances to these areas, to twice the distance of the furthest area 
that it monitors. 



2: Based on information a sensor receives in step 1, from the set of sensors 
with which it knows it shares an area in common, it records the distance 
from the area to the sensor that is furthest away as its dj parameter. If this 
distance is less than the distance of its broadcast in step 1, it instead sets 
its dj parameter to the distance that was used for broadcasting in step 1. 

This process ensures that every sensor node knows the broadcast distance 
necessary so that the other nodes covering a common area can be notified by 
the sensor. The dj distance will be used by the sensor to inform other sensors of 
its decisions during the partition phase. 

3.2 PARTITION PHASE 

In this phase, the sensors are partitioned into covers. The algorithm is initiated 
at time t = 0. 



Distributed Greedy Algorithm at Sensor j

While t < j
    If a message is received that an area $v \in S_j$ will be
    monitored by another sensor in cover $c_i$,
    then change the entry in row i, column v,
    from 1 to 0;

If t = j
    Choose $i \in \{1, \ldots, k\}$ such that the sum along row
    i is largest;
    Assign self to cover $c_i$;
    Broadcast information about this decision
    to neighbors;



The above distributed greedy algorithm is simple and requires only $nk|S_{max}|$
time. In addition, it is guaranteed to cover at least half of what the optimum
partition is capable of covering.
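
A minimal sequential simulation of the partition phase is sketched below. It is an illustration under assumptions rather than a distributed implementation: the per-cover sets in `covered` stand in for each sensor's 0/1 matrix and for the broadcasts that update it, sensors decide in ID order, covers are 0-indexed, and ties are broken toward the lowest cover index (the tie-breaking rule is not specified in the text).

```python
from typing import List, Set


def distributed_greedy(subsets: List[Set[int]], k: int) -> List[int]:
    """Return the cover index chosen by each sensor, deciding in ID order."""
    covered: List[Set[int]] = [set() for _ in range(k)]
    assignment: List[int] = []
    for areas in subsets:  # sensor j acts at time step t = j
        # The row sum for cover i equals the number of the sensor's areas
        # that no earlier sensor in cover i monitors.
        best = max(range(k), key=lambda i: len(areas - covered[i]))
        assignment.append(best)
        covered[best] |= areas  # stands in for the broadcast to neighbors
    return assignment


# Hypothetical toy instance with four sensors and k = 2 covers.
toy = [{0, 1}, {0, 1}, {1, 2}, {2}]
print(distributed_greedy(toy, k=2))
```

On the toy instance the two sensors that monitor the same pair of areas end up in different covers, so both covers monitor all three areas.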

Theorem 3. The distributed greedy algorithm is a $\frac{1}{2}$ approximation for the SET
K-COVER Problem.

Proof. Proof by construction. We will iterate back through the $n$ subsets, creating
a copy of $S_j$, called $S_j^*$, at its location $c_j^*$ in OPT. The number of newly
covered elements by $S_j^*$ is $\alpha(S_j^*)$ and the number of elements covered by $S_j$
at the moment it was assigned to $c_i$ at time $t = j$ will be called $\alpha(S_j)$. Because
$S_j$ was assigned to $c_i$, and because $\alpha(S_j^*)$ only decreases by the addition
of more covers as we are iterating backward, $\alpha(S_j^*) \leq \alpha(S_j)$. In addition,
$\sum_j \alpha(S_j^*) + \sum_j \alpha(S_j) \geq OPT$ since this assignment subsumes the sets assigned
to their optimal positions. Combining equations, $\sum_j \alpha(S_j) \geq \frac{OPT}{2}$.



4 Centralized Greedy Algorithm 



The centralized greedy algorithm has a better approximation ratio than the 
distributed greedy algorithm, and this ratio is tight for some instances of the 
problem. However, the communication and storage requirements for deploying 
this algorithm in a distributed setting are more involved than the above algo- 
rithms and may vary greatly between applications. We do not propose this as a 
distributed algorithm but instead show that in a centralized setting, the perfor- 
mance of the randomized algorithm can be made into a deterministic guarantee. 
We leave as an open problem the implementation of this algorithm in a dis- 
tributed setting. 

This algorithm is the same as the distributed greedy algorithm except that
each area is assigned a weight of $(1 - \frac{1}{k})^{y_v - 1}$, where $y_v$ is the number of subsets
containing area $v$, in the given time step, that have not yet been assigned to a
cover. Now, instead of summing entries in the rows of the matrix, the matrix is
multiplied with a $|S_j| \times 1$ vector corresponding to the weights of the areas covered
by the sensor. The sensor is then assigned to the cover corresponding to the largest
entry of the resulting vector of $k$ values. Through this process, the
algorithm chooses a cover $c_i$, for a given subset $S_j$, that maximizes the weighted
sum of uncovered elements,
$\sum_{v : v \in S_j \wedge v \notin \bigcup_{S_l \in c_i} S_l} (1 - \frac{1}{k})^{y_v - 1}$, instead of simply
$\sum_{v : v \in S_j \wedge v \notin \bigcup_{S_l \in c_i} S_l} 1$ as in the distributed greedy algorithm. This is an intuitive
algorithm in that each subset is assigned to the cover where it covers the largest
possible number of uncovered elements, weighted according to how likely it is
that the element will be covered in future iterations.



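Centralized Greedy Algorithm
    Initialize $C = \{c_1 := \emptyset, \ldots, c_k := \emptyset\}$;
    For $j := 1$ until $n$
        find $i = \arg\max_i \sum_{v : v \in S_j \wedge v \notin \bigcup_{S_l \in c_i} S_l} (1 - \frac{1}{k})^{y_v - 1}$;
        $c_i := c_i \cup S_j$ (assign $S_j$ to the cover $c_i$);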
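
The weighted rule can be sketched in Python as follows. This is a minimal illustration under assumptions: `y[v]` counts the not-yet-assigned subsets containing area v (including the one currently being placed), covers are 0-indexed, subsets are processed in the given order, and ties go to the lowest cover index. The function name and data layout are hypothetical.

```python
from collections import Counter
from typing import List, Set


def centralized_greedy(subsets: List[Set[int]], k: int) -> List[int]:
    """Assign each subset to the cover maximizing its weighted newly covered areas."""
    # y[v]: number of unassigned subsets that contain area v.
    y = Counter(v for areas in subsets for v in areas)
    covered: List[Set[int]] = [set() for _ in range(k)]
    assignment: List[int] = []
    for areas in subsets:
        def gain(i: int) -> float:
            # Weight (1 - 1/k)^(y_v - 1) for each area not yet covered in cover i.
            return sum((1 - 1 / k) ** (y[v] - 1) for v in areas - covered[i])

        best = max(range(k), key=gain)
        assignment.append(best)
        covered[best] |= areas
        for v in areas:
            y[v] -= 1  # this subset is now assigned
    return assignment


toy = [{0, 1}, {0, 1}, {1, 2}, {2}]
print(centralized_greedy(toy, k=2))
```

The weight is largest for areas that few remaining subsets can still cover, so the rule prefers to spend a subset on areas that are unlikely to be picked up later.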

We will prove that this algorithm gives a $1 - \frac{1}{e}$ approximation ratio by showing
that the above greedy algorithm is the derandomization of random assignment
using the method of conditional expectation.

Theorem 4. The centralized greedy algorithm is a $1 - \frac{1}{e}$ approximation for the
SET K-COVER Problem.

Proof. We would like to show that at each decision, the conditional expectation,
given that decision, is at least the expectation before being conditioned on
that decision. Suppose we are at the step where we are assigning subset $S_j$. We
want to assign $S_j$ to a cover such that the expected number of areas covered,
conditioned on having assigned $S_j$ to cover $c_i$, is maximized. More precisely, if we
denote by $a_{ji}$ the assignment of subset $j$ (in iteration $j$) to cover $c_i$ and by $p_a$
all subset-cover assignments from previous rounds, we want to choose $i$ that
maximizes $E[l_v | p_a \wedge a_{ji}]$. Because we maximize at every step, by linearity of
expectation, the conditional expectation cannot decrease. Therefore, at the end
of the algorithm, we have an assignment for which the objective function is at
least the expected initial value [6].

The subset $S_j$ will only affect $E[l_v | p_a \wedge a_{ji}]$ if it contains area $v$, so we will
ignore areas not in $S_j$ in our decision. Suppose an area $v$ that is in subset $S_j$
is covered in exactly $x$ covers before the assignment of $S_j$. Then the expected
number of times $v$ will be covered is $E[l_v] = k - (k - x)(1 - \frac{1}{k})^{y_v}$. Regardless
of where $S_j$ is placed, $y_v$ will decrease by 1. If $v$ is newly covered in some cover,
$x$ will increase by 1, otherwise $x$ will remain unchanged. Let us consider both
scenarios:

I: Element $v$ is not newly covered by $S_j$ in the assignment $a_{ji}$. Then,

$$E[l_v | p_a \wedge a_{ji}] = k - (k - x)\left(1 - \frac{1}{k}\right)^{y_v - 1}$$

II: Element $v$ is newly covered by $S_j$ in the assignment $a_{ji}$. Then,

$$E[l_v | p_a \wedge a_{ji}] = k - (k - x - 1)\left(1 - \frac{1}{k}\right)^{y_v - 1}$$

The component of the conditional expectation that our choice of assignment
affects is whether an element falls into scenario I or II. If it is in scenario
II, the profit is the difference between the two expressions above, $(1 - \frac{1}{k})^{y_v - 1}$. So we want
to maximize $\sum_{v : v \in S_j \wedge v \notin \bigcup_{S_l \in c_i} S_l} (1 - \frac{1}{k})^{y_v - 1}$. This results in the above greedy
algorithm.

We now have an algorithm that deterministically performs as well as the
expected performance of the randomized algorithm.

5 Hardness of Approximation 

For specific cases, our algorithm is tight. In particular, SET K-COVER is a 
generalization of the E4-SET SPLITTING problem, and it is NP-hard to design 
an approximation algorithm for E4-SET SPLITTING that performs better than 
our algorithm. We will first show a weaker statement, that the general case 
cannot be approximated to better than $\frac{15}{16}$. We will begin with some necessary 
definitions. 

Definition 5. In the E4-SET SPLITTING problem we are given a ground set
$V$ and a number of sets $R_i \subset V$ each of size exactly 4. Find a partition $V_1, V_2$ of
$V$ to maximize the number of $i$ with both $R_i \cap V_1$ and $R_i \cap V_2$ nonempty.

The hardness of approximation for E4-SET SPLITTING has been well stud- 
ied, leading to the following result using PCP [8] . 



Theorem 6. It is NP-hard to distinguish between instances of Max E4-SET
SPLITTING where all the sets can be split by some partition and those where
any partition splits at most a fraction $\frac{7}{8} + \epsilon$ of the sets, for any $\epsilon > 0$.

We use the above definitions to show the hardness of SET K-COVER. 

Theorem 7. It is NP-Complete to $\alpha$-approximate the SET K-COVER problem
with $\alpha > \frac{15}{16} + \epsilon$ for any $\epsilon > 0$.

Proof. Given an approximation algorithm $A$ for the SET K-COVER problem,
we could use it to approximate E4-SET SPLITTING. Suppose we would like to
approximate an instance $I$ of the E4-SET SPLITTING problem. We can create
an instance $I'$ of the SET 2-COVER problem. For every variable of the ground
set $V$ in $I$, there is a subset in $I'$. For every set $R_i \subset V$ in $I$, there is an
element in the set $S$ of $I'$. A subset in problem $I'$ contains an element of $S$ iff
the corresponding variable from $V$ belonged to the corresponding set $R_i$. The
proof is by contradiction. Assume $\alpha = \frac{15}{16} + \epsilon$ for some $\epsilon > 0$.

Case 1: All the sets can be split in $I$. Then the optimum in $I'$ is $2|S|$ and
we run algorithm $A$ on $I'$ and are guaranteed to cover at least
$(\frac{15}{16} + \epsilon) 2|S| = (\frac{15}{8} + 2\epsilon)|S|$ elements.

Case 2: Only a fraction $\frac{7}{8} + \epsilon$ of the sets can be split in $I$. Then the optimum
in $I'$ is at most $(\frac{15}{8} + \epsilon)|S|$, since every element is covered once and only the
split sets are covered twice, and any solution to $I'$ will be at most this value.

Since $(\frac{15}{8} + 2\epsilon)|S| > (\frac{15}{8} + \epsilon)|S|$, we could use $A$ to distinguish between
instances of $I$ that can be split completely and instances where only a fraction
$\frac{7}{8} + \epsilon$ of the sets can be split, which would contradict Theorem 6.
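
The construction used in the reduction can be made concrete with a short Python sketch. The helper names (`set_splitting_to_2cover`, `split_count`, `cover_value`) and the toy instance are hypothetical; the sketch only illustrates the mapping and the identity that the SET 2-COVER objective equals the number of sets plus the number of split sets, assuming every ground element is placed in exactly one side of the partition.

```python
from typing import Dict, List, Set


def set_splitting_to_2cover(ground: List[str],
                            sets_r: List[Set[str]]) -> Dict[str, Set[int]]:
    """One subset per ground element; area i belongs to it iff the element is in R_i."""
    return {x: {i for i, r in enumerate(sets_r) if x in r} for x in ground}


def split_count(v1: Set[str], sets_r: List[Set[str]]) -> int:
    """Number of sets R_i with elements on both sides of the partition (V1, V \\ V1)."""
    return sum(1 for r in sets_r if (r & v1) and (r - v1))


def cover_value(subsets: Dict[str, Set[int]], v1: Set[str]) -> int:
    """SET 2-COVER objective when cover 1 holds V1 and cover 2 holds the rest."""
    c1 = set().union(*(a for x, a in subsets.items() if x in v1))
    c2 = set().union(*(a for x, a in subsets.items() if x not in v1))
    return len(c1) + len(c2)


ground = list("abcdef")
sets_r = [set("abcd"), set("cdef"), set("abef")]
subsets = set_splitting_to_2cover(ground, sets_r)
for v1 in ({"a", "c"}, {"a", "b", "c", "d"}):
    print(sorted(v1), split_count(v1, sets_r), cover_value(subsets, v1))
```

In the second partition the set {a, b, c, d} lies entirely in one side, so its area is covered by only one of the two covers and the objective drops by one.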

In fact, after more precise analysis of the centralized greedy algorithm in the
context of E4-SET SPLITTING, we see that the algorithm achieves an approximation
ratio of exactly $\frac{15}{16}$ and is therefore tight.

Theorem 8. The centralized greedy algorithm is the best approximation algorithm
possible for specific instances of the SET K-COVER problem.

Proof. Consider instances where the number of covers is $k = 2$ and every area
is contained in exactly 4 subsets, implying $N_v = 4$, $\forall v$. From the proof of
Theorem 1, the approximation ratio is

$$\frac{E[l_v]}{\min(k, N_v)} = \frac{k - k(1 - \frac{1}{k})^{4}}{k} = \frac{15}{16}.$$

The centralized greedy algorithm is therefore the best approximation possible
when we constrain the parameters $k$ and $N_v$.



6 Simulation Results 

We performed simulations using all three algorithms. Problem instances were
generated by setting parameters $|S|$ (number of areas), $n$ (number of subsets),
and $|E|$ (number of edges). Then a bipartite graph is created, where the edges
are chosen uniformly at random from all possible subset-area pairs. A subset
is then considered to contain an area if it has an edge connecting it with that
area. For each set of parameters, ten problem instances were generated and the
numbers in the tables below are the average result over all ten instances.
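
One way to generate such instances is sketched below. It assumes the $|E|$ edges are distinct (sampled without replacement), which the text does not state explicitly; the function name `random_instance` and the integer encoding of subset-area pairs are illustrative choices.

```python
import random
from typing import List, Optional, Set


def random_instance(num_areas: int, n: int, num_edges: int,
                    seed: Optional[int] = None) -> List[Set[int]]:
    """Return n subsets of {0, ..., num_areas - 1} joined by num_edges random edges."""
    rng = random.Random(seed)
    # Each integer p in [0, n * num_areas) encodes the pair
    # (subset p // num_areas, area p % num_areas); sample distinct pairs.
    pairs = rng.sample(range(n * num_areas), num_edges)
    subsets: List[Set[int]] = [set() for _ in range(n)]
    for p in pairs:
        subsets[p // num_areas].add(p % num_areas)
    return subsets


instance = random_instance(num_areas=1000, n=500, num_edges=5000, seed=0)
print(len(instance), sum(len(s) for s in instance))
```

Averaging an algorithm's coverage over ten such instances, each with a different seed, mirrors the procedure described above.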

We chose this approach as opposed to an approach where areas are points in 
Euclidean space and sensors sense within a radius of their location (as in [2]) 
because the latter limits the variety of applications. For instance, if the
sensors are embedded in vehicles, animals, or robots that are moving around in
some physical space, then the set of problem instances is much richer, and our
test scenarios capture this richness of possible applications.

6.1 Performance Compared to the Optimum 

Simulations show that in practice, when compared to the optimum, our algorithms
perform better than their worst case bounds. We bound the optimum
by noting that the objective function of the optimum partition cannot be larger
than $k \cdot |S|$, since we can cover at most all the areas in all covers. We also cannot
hope to achieve more coverage than there are edges. Thus we have two possible
upper bounds for the optimum objective function; the smaller of the two is listed
in the column labeled OPT bound.
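
The bound is simple enough to state as a one-line function. The sketch below (function name hypothetical) takes the smaller of the two upper bounds and uses it to express a reported coverage as a fraction of the bound; the sample numbers are the parameters $|S| = 1000$, $k = 10$ used for Table 2.

```python
def opt_upper_bound(k: int, num_areas: int, num_edges: int) -> int:
    """Upper bound on the optimum objective: at most k * |S| and at most |E|."""
    return min(k * num_areas, num_edges)


def fraction_of_bound(coverage: float, k: int, num_areas: int, num_edges: int) -> float:
    """Coverage achieved by an algorithm as a fraction of the upper bound."""
    return coverage / opt_upper_bound(k, num_areas, num_edges)


print(opt_upper_bound(k=10, num_areas=1000, num_edges=5000))
print(opt_upper_bound(k=10, num_areas=1000, num_edges=20000))
```

Because the true optimum can be below this bound, ratios computed against it can only understate how close an algorithm is to optimal.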



   n     |E|    OPT bound   Random   Distributed Greedy   Centralized Greedy
 1000    5000      5000      3950          4837                 4832
 1000   10000     10000      6330          7625                 7647
 1000   20000     10000      8655          9677                 9727
  500    5000      5000      3951          4626                 4628
  500   10000     10000      6305          7277                 7296
  500   20000     10000      8640          9443                 9470
 2000    5000      5000      3961          4953                 4954
 2000   10000     10000      6345          8047                 8068
 2000   20000     10000      8665          9908                 9959

Table 2. For these simulations, $|S| = 1000$ and $k = 10$.



Simulations indicate that the distributed greedy algorithm achieves performance
that is on the order of 10-20% better than the randomized algorithm. The
performance of the distributed and centralized greedy solutions is strikingly
close, differing by less than 1% in every instance of the problem that was tested.

We see the randomized algorithm is consistent with theoretical analysis, with
the worst performance achieving 63% coverage, which is quite close to the
analysis of $1 - \frac{1}{e}$.

Both deterministic algorithms perform significantly above their worst case
bounds, with the lowest ratio covering more than 72% of the maximum possible.
Many instances perform even higher, with four instances achieving higher than
99% of the maximum possible.

6.2 Increased Network Longevity 

Our simulations used the SET K-COVER algorithms to partition the sensors
into k covers such that when we rotate among the k covers, more than 80%
of the areas are covered within the sliding window of k previous time steps.
Specifically, we maximize k such that the total coverage is more than $0.8kn$.
Since every set belongs to some cover, every area is covered at least once every
k time steps. The lifetime of our solution is compared with the straightforward
approach of activating all the sensors every time step until the percent coverage
over the previous k time steps drops below 80%. We assume that all sensors have
the same amount of power initially, that their energy depletes at the same rate,
and that they are all capable of lasting for several time steps. Therefore, if the
SET K-COVER can achieve the specified goal of 80%, then this signifies the
lifetime of the network is more than $k - 1$ times longer than the lifetime when
the straightforward approach is used. Because we only require information from
80% of the nodes on average, this approach is most valuable for WSNs where it
is not necessary to collect information from all the data in every time step. In
a WSN where network longevity is of primary importance this approach uses k
times less energy to collect the required information.

Our simulations try several values of k, which is difficult to do in a distributed 
setting. However, it is possible to find a good value for k in advance through 
simulations or mathematical properties of k. WSN designers can choose k such 
that, in expectation, the solution has the desired properties. Alternatively, run- 
ning simulations in advance allows designers to make a good choice for the value 
of k ahead of time. 

Our simulations show a significant increase in the lifetime of a network that 
uses the SET K-COVER solution. In Figure 1, the value of k is plotted for 
problem instances with varying density. For all three algorithms, the increase in 
longevity is proportional to the amount of connectivity. This is expected, since 
a highly connected graph has more overlap and therefore more redundancy that 
the SET K-COVER approach can utilize to increase the lifetime of the network. 
This relationship between connectivity and energy savings is reflected in the 
simulation results. 

In addition to the clear benefits in energy savings, the covers produced by 
our algorithms have the useful property that they result in coverage of an area 
that is positively correlated with the number of sensors covering that area. This 
means that if there is a particular area in need of more frequent monitoring, 
then multiple sensors could be deployed close together to bolster the monitoring 
capabilities in that area. For example, if we are monitoring traffic, we might 
want frequent coverage of a busy highway intersection, and have less need for 
vigilant sensor information about an empty country road. Figure 2 charts 200
elements for a single problem instance, with $N_v$ plotted along the domain and
$l_v / k$ plotted along the range. The randomized algorithm was applied 100 times on
the same problem instance and the results in Figure 2 are the average over all
of these runs. The optimum equals $\min(\frac{N_v}{k}, 1)$ because an area cannot belong
to more covers than the number of subsets containing it. In the distributed and
centralized greedy algorithms, no $l_v$ has a value that is less than 50% of the
optimum $l_v$ it could possibly obtain. In the randomized algorithm, the worst
ratio occurs when $k = 10$ and the ratio is $.63 \approx 1 - \frac{1}{e}$, in accord with theoretical
analysis. We see that on average, the $l_v$ values are within 70% to 80% of the
optimum. These simulations suggest that the algorithms are fair in that every
area receives coverage relative to the number of sets that cover the area.

Another convenient property of the greedy algorithms is that for all covers
in a given solution, the number of areas covered by each cover lies within a
small range. Thus, we could use the SET K-COVER partitions if we had the
requirement that every cover monitor at least 80% of the areas. In Figure 3 we
graph the size of the minimum cover divided by the size of the maximum cover,
over several problem parameters. We see that for the distributed and centralized
greedy algorithms, the smallest cover is always at least 60% of the largest cover.
When there are many covers (as in the problems with $|E| = 2000$), the ratio
decreases slightly since it is more likely to have outliers when the group is larger.
When there are many covers in the randomized algorithm, however, there are a
few covers with no areas at all. As the number of covers increases, the probability
that there will be a cover with few or no areas becomes larger, leading to the fast
dropoff we observe in Figure 3. Therefore, the distributed and centralized greedy
algorithms are better suited for applications that require covers that lie within
a close range of coverage.

Fig. 2. For this problem instance, $k = 10$, $|S| = 200$, $n = 100$, and $|E| = 2000$.

7 Open Problems 

It is an interesting area of further research to determine whether the centralized
greedy algorithm can be efficiently implemented in a distributed fashion. The
main challenge in making this algorithm distributed is that it is not clear where
the $y_v$ values that determine the weights should be stored and how their values
are to be updated in every round. One possible solution is to run the preprocessing
phase between every sensor assignment, but this significantly increases
the communication overhead.

From a theoretical perspective, this work raises the question of whether the
centralized greedy algorithm is tight for the general case when the $N_v$ are
non-uniform and $k > 2$. Perhaps the recent breakthroughs in lower bound results
using PCP [7] [8] can be applied to the SET K-COVER problem.

Another open area of further study is the design of approximation algorithms
for fair versions of the problem. The approach in [2] is to design an algorithm
that maximizes k, such that all areas are included in every cover. We examined
a flipped variant of the problem in Section 2, where, given a value of k, the
fewest number of times any element is covered is maximized. In the optimum,
the second problem subsumes the first since, by doing a binary search on k
and choosing the largest k for which $k = l$, we have found the solution to the
first problem. However, in a distributed sensor network environment, it is very
difficult to try many possible values of k. More work needs to be done to give a
deeper understanding of the implications of using either method.

Fig. 3. Cover Size Ranges.

Finally, it would be interesting to explore how to place sensors in a way that 
works well in conjunction with round-robin covering. 